FOREWORD


The Armed Services Vocational Aptitude Battery is a multiple-aptitude test
battery used for selection and classification of United States military
personnel. Major Army research efforts are underway which are directed at
relating scores achieved by enlisted accessions on this test to performance
and success in training and on the job. The purpose of this research was to
verify the reliability of the reported scores to ensure that validation
research was grounded on accurate ASVAB test scores. The results of this
research verify that scoring and reporting of the ASVAB results at MEPCOM
installations are reasonably accurate.

Research also was conducted to examine the test score changes for
applicants who failed to achieve the required ASVAB cut scores for
enlistment. The results of this research indicate that these applicants
showed the greatest change in the speeded subtests of the ASVAB.

This research was carried out under contract by RESEARCH APPLICATIONS,
INCORPORATED of Rockville, Maryland under the direction of the Selection and
Classification Technical Area in response to the requirements of Army Project
No. 2Q263731A792.


STUDY OF THE RELIABILITY OF SCORES FOR FISCAL YEAR 1981 ARMY APPLICANTS:
ARMED SERVICES VOCATIONAL APTITUDE BATTERY FORMS 8, 9 AND 10


BRIEF 

REQUIREMENTS: 

To  assess  the  accuracy  of  Armed  Services  Vocational  Aptitude  Battery 
(ASVAB)  subtest  scores  as  reported  by  Military  Enlistment  Processing  Command 
(MEPCOM)  Military  Entrance  Processing  Stations  (MEPS),  and  the  contracted 
Mobile  Examining  Test  (MET)  sites  for  purposes  of  establishing  a  reliable 
FY  1981  Army  applicant  data  base. 

PROCEDURES: 

Answer sheets completed by initial test applicants for the U.S. Army
were rescored by an independent contractor. The scores reported by the MEPS
for each subtest of the ASVAB were compared to the scores computed for each
subtest of the ASVAB by the independent contractor. Also, an analysis of
test-retest scores achieved by the Army applicants was conducted using ASVAB
score data reported by the MEPS.

FINDINGS: 

More than 143,000 Army applicants had matching MEPS and contractor-scored
ASVAB data. A subtest comparison of test scores indicated that the
means of six of the ten subtest scores reported by the MEPS differed from
those computed by the contractor. However, computations of the AFQT and Army
Combat composite indicated agreement in the classification of applicants in
approximately 97% and 94% of the cases, respectively. The analyses of the
data for those applicants who took the ASVAB twice showed that retesting
raised speeded test scores achieved by this group to the level of those
applicants who did not retest. There was little or no change in the scores
for the power tests.

UTILIZATION: 

The results of the test score comparison verify the accuracy of scores
reported by the MEPS. The retested applicants improved their scores on the
two speeded tests, and on two of the eight power tests. However, all power
subtest scores remained significantly lower than those achieved by one-time
ASVAB examinees.
INTRODUCTION


The  U.S.  Army,  along  with  the  other  major  branches  of  the  military 
service,  uses  the  Armed  Services  Vocational  Aptitude  Battery  (ASVAB)  for 
selection  and  classification  of  enlistees.  A  long-term  major  research  effort 
is  being  initiated  by  the  Army  to  relate  the  scores  achieved  on  this  test  to 
performance  in  training  and  on  the  job. 

The  ASVAB  tests  are  administered  throughout  the  United  States  through  a 
large  testing  network.  The  network  consists  of  68  Military  Entrance 
Processing  Stations  (MEPS)  and  numerous  satellite  testing  locations,  or 
Mobile  Examining  Test  Sites  (METS).  The  test  scores  are  computed  at  the  MEPS 
and  forwarded  to  a  central  registry  to  develop  a  record  for  each  applicant. 

In  FY  1981,  there  were  more  than  490,000  applicants  for  the  Army.  The 
accuracy  of  the  scoring  of  the  ASVAB  at  the  MEPS  is  of  great  interest  to  the 
Army,  because  of  the  need  to  have  reliable  ASVAB  score  data  on  the  FY  1981 
accession cohort. Evaluating the accuracy of ASVAB test scores reported by
the MEPS required rescoring the original ASVAB answer sheets and comparing
the two sets of scores.

Included  in  the  FY  1981  applicant  pool  were  almost  30,000  individuals 
who  failed  to  qualify  for  enlistment  based  on  their  initial  scores  on  the 
ASVAB  and  who  were  retested.  Data  were  available  from  MEPS  files  to  examine 
the  changes  in  their  test  scores  as  a  result  of  repeated  administration  of 
the  ASVAB  and  to  compare  these  changes  with  those  scores  achieved  by  single 
administration  applicants. 

The  work  conducted  in  this  research,  therefore,  was  carried  out  in  two 
concurrent  efforts.  The  first  effort  was  designed  to  yield  information  about 
the  reliability  of  the  ASVAB  test  scores  as  reported  by  the  MEPS.  The  second 
effort  was  designed  to  examine  the  effects  of  retesting  on  scores  achieved  by 
applicants  who  failed  to  attain  a  minimally  acceptable  score  for  enlistment 
on  previous  test  administrations.  The  methods,  procedures  and  results  of  the 
data  analyses  followed  in  each  effort  are  described  in  turn  in  the  remainder 
of  this  report. 


PART  I.  RELIABILITY  OF  MEPS-REPORTED  ASVAB  SCORES 
The ASVAB consists of ten subtests administered during a two-hour and
forty-five minute session to screen applicants for military service. Each of
the major branches of the service uses a composite of four of the subtests,
Word Knowledge (WK), Arithmetic Reasoning (AR), Paragraph Comprehension (PC)
and Numerical Operations (NO), as a minimum criterion for acceptance. This
composite is known as the Armed Forces Qualifying Test (AFQT).

The remaining six subtests of the ASVAB are combined in various ways
with some of the AFQT subtests to form composites of specific interest to
branches of the military for initial classification of the applicants. The
Combat Composite (CO), consisting of the Coding Speed (CS), Mechanical
Comprehension (MC) and Auto/Shop Information (AS) subtests, and the
Electronics Composite (EL), consisting of the General Science (GS),
Mathematics Knowledge (MK) and Electronics Information (EI) subtests, are two
of those which are used by the Army to further classify applicants for
initial assignment. Both the raw subtest scores and the scores for the AFQT,
CO and EL composites were compared using a large number of original test
answer sheets sent to the Army by the MEPS.

METHOD 

In fiscal year 1981, more than 198,000 Army applicants who took the
ASVAB on one occasion only were identified for the study through submission
of original test responses by the MEPS.

To prepare the test answer sheets for scanning, the project staff first
performed a sorting routine in which the answer sheets designated for
services other than the Army, those for the Army National Guard and Reserves,
and also the Army retests and verifications, were separated from the Active
Army initial test responses.

It  was  expected  that  the  MEPS  would  provide  completed  answer  sheets 
covering  all  of  the  months  from  October  1980  through  September  1981.  It 
was  found  that  for  a  majority  of  the  MEPS,  answer  sheets  for  some  of  these 
months  were  missing.  In  most  cases,  the  answer  sheets  missing  were  those  for 
the  months  before  April  1981.  The  fact  that  answer  sheets  for  several  months 
of  testing  would  not  be  available  for  study  resulted  in  ARI's  decision  to 
abandon  plans  for  analysis  of  these  data  by  month  of  testing. 

In  a  number  of  cases,  social  security  numbers  and  other  demographic  and 
identification  data  were  missing  from  the  answer  sheets.  Where  possible, 
these  data  were  copied  from  the  computer  sheets  attached  to  the  answer 
sheets.  Where  computer  sheets  were  not  available,  the  ARI  applicant  file  was 
used  to  categorize  the  examinee.  However,  since  the  data  were  merged  with 
the  ARI  applicant  file  by  social  security  number,  missing  or  incompletely 
filled  in  social  security  numbers  resulted  in  some  loss  of  data. 
Other losses occurred in the data preparation process. Many of the MEPS
had  sent  answer  sheets  with  the  pages  stapled  together.  The  optical  scanning 
machines  tended  to  reject  answer  sheets  with  staple  holes  in  the  area  of  the 
sequence  number. 

Preprocessing computer programs eliminated 15,214 records with bad
social security numbers and 224 data sets with bad test versions. In all,
183,413 ASVAB data sets were scored, and 149,825 one-time only records were
successfully merged with the ARI applicant file. Given the estimate of
490,000 Army applicants, the number of records matched by the contractor
represented 31% of the total applicant data file provided by the MEPS.
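The yield figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are illustrative and are not taken from the contractor's programs.

```python
# Record counts quoted in the Method section.
scored = 183_413                # ASVAB data sets scored by the contractor
merged = 149_825                # one-time only records merged with the ARI file
estimated_applicants = 490_000  # estimated FY 1981 Army applicants

# Share of scored data sets that survived the merge.
merge_yield = merged / scored

# Share of the estimated applicant population covered by matched records.
coverage = merged / estimated_applicants

print(f"{merge_yield:.1%} of scored data sets were merged")
print(f"{coverage:.1%} of estimated applicants were covered")
```

The coverage value rounds to the 31% stated in the text.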

ANALYSES 

A series of analyses was conducted on the initial-test-only Army
applicant pool to examine the reliability of the MEPS-reported test scores.
The applicant data pool was initially screened to eliminate those individuals
with out-of-range or missing scores on any one of the ten subtests. The
application of this procedure yielded a final applicant pool of 143,279 for
analysis.
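A minimal sketch of such a screen follows. It assumes that "out-of-range" means a raw score below zero or above the number of items in the subtest (item counts as listed in Table 1); the report does not spell out the rule, and the record layout shown is hypothetical.

```python
# Number of items per subtest, as listed in Table 1.
ITEM_COUNTS = {
    "GS": 25, "AR": 30, "WK": 35, "PC": 15, "NO": 50,
    "CS": 84, "AS": 25, "MK": 25, "MC": 25, "EI": 20,
}

def has_valid_scores(record):
    """Return True when all ten subtest scores are present and in range.

    `record` is a hypothetical dict mapping subtest codes to raw scores.
    """
    for subtest, n_items in ITEM_COUNTS.items():
        score = record.get(subtest)
        if score is None or not 0 <= score <= n_items:
            return False
    return True

# Example: one usable record and one with an impossible Coding Speed score.
good = {code: 10 for code in ITEM_COUNTS}
bad = dict(good, CS=90)
kept = [r for r in (good, bad) if has_valid_scores(r)]
```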

RESULTS 

Using the responses from the initial-test-only data base, the mean and
standard deviation of each of the subtest scores were computed for all
matched applicant data sets. A comparison of the differences between the
means indicated that, although significant differences were detected for
six of the ten subtests of the ASVAB (see Table 1), only the mean difference
in scores for Coding Speed (CS) appeared large enough to require further
examination. The comparative score data revealed that the greatest number of
scores which did not match was in the CS subtest. A count was made of the
number of MEPS by percent of matching CS scores for males. Only one MEPS had
a 100% match, but there were only 17 cases in the data base for this MEPS.
In general, most of the MEPS (62 in all) had matched CS subtest scores for
between 70% and 90% of the male cases. Almost half of the MEPS
(29) had matched scores for between 80% and 84% of the cases.
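The report does not state how the z values in Table 1 were computed. The sketch below assumes the usual large-sample statistic for a difference between two means that treats the two sets of scores as independent samples of equal size; that formula reproduces the printed values for most subtests to within rounding. The function name is illustrative.

```python
from math import sqrt

def z_between_means(mean_a, sd_a, mean_b, sd_b, n):
    """Large-sample z for the difference between two means (equal n).

    Assumption: the two score sets are treated as independent samples,
    although here they come from the same applicants.
    """
    standard_error = sqrt((sd_a ** 2 + sd_b ** 2) / n)
    return (mean_a - mean_b) / standard_error

N = 143_279  # matched applicants in Table 1

# MEPCOM and contractor values for two subtests, taken from Table 1.
z_gs = z_between_means(13.906, 5.323, 13.893, 5.319, N)
z_cs = z_between_means(41.983, 15.052, 41.695, 15.013, N)
print(round(z_gs, 3), round(z_cs, 3))
```

A paired-difference statistic would be the more conventional choice for rescored answer sheets, but it requires the applicant-level differences, which the report does not tabulate.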

The results of a factor analysis of the subtest scores achieved by this
group were consistent with findings from other factor-analytic research on
ASVAB test scores. Two factors, power and speed, emerged from the analyses.

The AFQT, CO and EL composites were computed to examine the differences
in classification of initial-test-only applicants based on contractor and
MEPS-reported data.

TABLE 1. Comparison of MEPCOM and Contractor Scored Sample Subtest Means

                                Number       MEPCOM           CONTRACTOR
Subtest Name                   of Items     X       sd        X       sd       z

General Science (GS)              25     13.906   5.323    13.893   5.319    0.654
Arithmetic Reasoning (AR)         30     15.974   6.879    15.954   6.875    0.779
Word Knowledge (WK)               35     22.433   8.129    22.307   8.148    4.154
Paragraph Comprehension (PC)      15      9.368   3.586     9.331   3.604    2.758
Numerical Operations (NO)         50     33.433  10.634    33.319  10.708    2.860
Coding Speed (CS)                 84     41.983  15.052    41.695  15.013    5.128
Auto/Shop Information (AS)        25     14.486   5.705    14.433   5.724    2.482
Mathematics Knowledge (MK)        25     10.956   5.218    10.932   5.216    1.231
Mechanical Comprehension (MC)     25     13.321   5.184    13.292   5.188    0.474
Electronics Information (EI)      20     10.957   4.063    10.905   4.073    3.430

N = 143,279


The AFQT composite score was computed using MEPS-reported and
contractor-compiled scores for the matched groups of applicants. The AFQT
consists of the sum of the scores obtained on the AR, WK and PC subtests and
one-half of the score obtained on the NO subtest. The AFQT raw score
composites were converted to the categories used by the Armed Forces for
enlistment. The number of applicants was tabulated by AFQT category for each
AFQT composite (see Table 2). The breakdowns for the AFQT categories are as
follows:

AFQT Category     AFQT Raw Score Range

I                 101-105
II                 84-100
IIIA               76-83
IIIB               65-75
IVA                56-64
IVB                49-55
IVC                38-48
V                   0-37
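The composite rule and the category ranges above can be sketched as a small lookup. Because one-half of the NO score can produce a half point, and the report does not say how such values were rounded, the sketch assumes a raw score belongs to the category whose lower bound it has reached. The function names are illustrative.

```python
from bisect import bisect_right

# Lower bound of each AFQT category, from the ranges listed above.
_FLOORS = [0, 38, 49, 56, 65, 76, 84, 101]
_LABELS = ["V", "IVC", "IVB", "IVA", "IIIB", "IIIA", "II", "I"]

def afqt_raw(ar, wk, pc, no):
    """AFQT raw composite: AR + WK + PC + one-half of NO."""
    return ar + wk + pc + no / 2

def afqt_category(raw):
    """Map a raw composite to its category.

    Assumption: half-point composites stay in the category whose lower
    bound they have reached (64.5 is treated as IVA, not IIIB).
    """
    if not 0 <= raw <= 105:
        raise ValueError("AFQT raw composite must lie between 0 and 105")
    return _LABELS[bisect_right(_FLOORS, raw) - 1]

example = afqt_raw(ar=16, wk=22, pc=9, no=33)
print(example, afqt_category(example))
```

The maximum composite is 30 + 35 + 15 + 50/2 = 105, which agrees with the upper bound of Category I.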

The number of applicants who changed from one AFQT category to another based
on these computations was seen as an indication of the error associated with
scoring these subtests. Since the shift in the applicants' AFQT category
could be the result of errors made by both the MEPS and the contractor in
scoring the test, it was determined that any estimate of the error in
classification should be adjusted empirically. In this case, the number of
applicants one cell to the left and right of the category on the diagonal
would be used to estimate the error of assignment.

A simple difference between these two values was computed and an average
error rate was estimated for cell categories. There appears to be an
estimated error rate of ±1.37% by the MEPS in assigning applicants to mental
categories. Based on the AFQT scores of 490,000 Army applicants who were
tested during FY 1981, this error rate would translate to a little more than
6,700 applicants.

A similar error rate analysis was performed using transformed subtest
scores which constitute the CO and EL composites. The CO composite consists
of a sum of the transformed AR, CS, MC and AS subtest scores (see Table 3).
The average error rate computed for this composite was ±2.10%. Again, for
the estimated total of 490,000 applicants this error rate translates into
approximately 10,300 persons. The EL composite consists of a sum of the
transformed GS, AR, MK and EI subtest scores (see Table 4). The average
error rate computed for this composite was ±2.14%. This error rate
translates into approximately 10,500 persons for this composite.
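The report describes the error-rate estimate only in words. One reading, offered here as an assumption and not as the contractor's documented procedure, is: for each interior category, take the count one cell to the right of the diagonal minus the count one cell to the left, divide by the row total, and average across those categories. Applied to the CO counts printed in Table 3, this gives a value close to the reported ±2.10%.

```python
# Table 3 counts as printed: rows are MEPS-reported CO categories and
# columns are contractor-scored categories, both ordered 120+ to 40-79.
TABLE_3 = [
    [11872, 291, 23, 25, 12, 17, 12, 6, 19],
    [81, 15586, 333, 118, 30, 19, 16, 18, 46],
    [12, 84, 8532, 299, 33, 34, 13, 9, 26],
    [12, 30, 116, 13492, 338, 72, 32, 16, 70],
    [4, 13, 23, 129, 11398, 371, 70, 27, 60],
    [6, 12, 4, 34, 112, 11585, 388, 67, 93],
    [4, 4, 4, 12, 27, 147, 12192, 374, 159],
    [1, 2, 6, 4, 16, 31, 128, 10301, 481],
    [4, 9, 7, 18, 23, 30, 81, 208, 42861],
]

def adjacent_cell_error_rate(matrix):
    """Average over interior rows of (right neighbor - left neighbor
    of the diagonal cell) divided by the row total (assumed reading)."""
    rates = []
    for i in range(1, len(matrix) - 1):
        row = matrix[i]
        rates.append((row[i + 1] - row[i - 1]) / sum(row))
    return sum(rates) / len(rates)

rate = adjacent_cell_error_rate(TABLE_3)
projected = rate * 490_000  # scaled to the estimated applicant total
print(f"{rate:.2%} error rate, about {projected:,.0f} applicants")
```

Scaling the rate to the 490,000 estimated applicants gives a figure near the 10,300 persons quoted for the CO composite.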

DISCUSSION 

The comparison of the MEPS-reported ASVAB scores with the
contractor-computed ASVAB scores showed few discrepancies. As with previous
efforts conducted by ARI on smaller samples of applicants, the greatest
number of disparate score comparisons was identified with the CS subtest.
The differences in score reporting for this subtest may be attributable to
factors such as mistiming and misscoring of the items. Furthermore, the
error rate in categorizing applicants by composite also was found to be
minimal (see Tables 5, 6 and 7).
TABLE 2. Comparison of AFQT Category Assignment Based on AFQT Scores Computed
by Contractor and Reported by MEPS: FY 1981 Applicants.*

(-- = entry not legible)

Reported
Composite                         AFQT Raw Score Range
Category     101-105   84-100    76-83    65-75    56-64    49-55    38-48     0-37

I              3,137       85        7        2       --       --       --       --
II                26   26,964      310      144       --       --       --       --
IIIA               0       73   16,327      361       53       17       14       --
IIIB               2       15       64   25,464      434       70       25       --
IVA               --       --       --      110   19,015      355       91       --
IVB               --       --       --        7      127   14,415      378       --
IVC               --       --       --        7       23      144   18,560       --
V                 --       --       --       --       --       --      127   15,590

Total             --   27,154   16,720   26,099   19,683   15,028   19,220   16,205    143,279


TABLE 3. Comparison of Army Combat (CO) Composite Category Assignments Based on
Scores Computed by Contractor and Reported by MEPS: FY 1981 Applicants.*

MEPS Reported
Score Composite                    Contractor Scored Composite Category
Category         120+  110-119  105-109  100-104    95-99    90-94    85-89    80-84    40-79     Total

120+           11,872      291       23       25       12       17       12        6       19    12,277
110-119            81   15,586      333      118       30       19       16       18       46    16,247
105-109            12       84    8,532      299       33       34       13        9       26     9,042
100-104            12       30      116   13,492      338       72       32       16       70    14,178
95-99               4       13       23      129   11,398      371       70       27       60    12,095
90-94               6       12        4       34      112   11,585      388       67       93    12,301
85-89               4        4        4       12       27      147   12,192      374      159    12,923
80-84               1        2        6        4       16       31      128   10,301      481    10,970
40-79               4        9        7       18       23       30       81      208   42,861    43,241

Total          11,996   16,034    9,048   14,133   11,989   12,306   12,932   11,026   43,815   143,279
% of Total        8.4     11.2      6.3      9.9      8.4      8.6      9.0      7.7     30.5     100.0

*CO = AR + AS + MC + CS
TABLE 4. Comparison of Army Electronics (EL) Composite Category Assignments Based
on Scores Computed by Contractor and Reported by MEPS: FY 1981 Applicants.*


TABLE 5. Rescoring Shifts in AFQT Assigned Mental Category in Percents for
MEPCOM and Contractor Scored FY 1981 Army Applicants*, N = 143,279.

MEPS Reported
Score Mental         Contractor Computed Score Mental Category
Category          V    IVC    IVB    IVA   IIIB   IIIA   II&I

I&II              0      0      0      0      1      2    100
IIIA              0      0      0      0      1     98      0
IIIB              0      0      0      2     98      0      0
IVA               0      0      2     97      0      0      0
IVB               0      2     96      1      0      0      0
IVC               2     97      1      1      0      0      0
V                96      1      0      0      0      0      0

% of Total       11     13     11     14     18     12     21

TABLE 6. Rescoring Shifts in CO Composite in Percents for MEPCOM and Contractor
Scored FY 1981 Army Applicants*, N = 143,279.

MEPS
Reported                          Contractor Computed Score
Score            120+  110-119  105-109  100-104    95-99    90-94    85-89    80-84    40-79

120+               99        2        0        0        0        0        0        0        0
110-119             1       97        4        1        0        0        0        0        0
105-109             0        1       94        2        0        0        0        0        0
100-104             0        0        1       95        3        1        0        0        0
95-99               0        0        0        1       95        3        1        0        0
90-94               0        0        0        0        1       94        3        1        0
85-89               0        0        0        0        0        1       94        3        0
80-84               0        0        0        0        0        0        1       93        1
40-79               0        0        0        0        0        0        1        2       98

% of Total**        8       11        6       10        8        9        9        8       30

*CO = AR + AS + MC + CS

**May not total 100 due to rounding


PART  II.  RETEST  APPLICANT  STUDY 


More than 36,000 applicants were identified who had two sets of ASVAB
scores reported on their record, indicating that they may have taken the
battery on more than one occasion. A number of those individuals, however,
were found to have been given a verification form of the ASVAB as a result of
scoring inconsistencies or suspect test taking. These individuals were
removed from the larger group who were considered to be retested applicants.
Furthermore, applicants whose scores on all ten subtests were found
to be the same for the two sets of scores reported were dropped from the data
base. Finally, applicants whose AFQT composite exceeded the raw score
maximum were dropped from further analyses. From the
original pool of 36,000 applicants, 27,911 were identified as having been
retested.
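The three exclusion rules just described can be sketched as a single predicate. The record layout (`is_verification`, `first`, `second`) is hypothetical, and the sketch assumes the raw score maximum is the 105 points implied by the AFQT formula and that the check applies to either set of scores.

```python
SUBTESTS = ("GS", "AR", "WK", "PC", "NO", "CS", "AS", "MK", "MC", "EI")
AFQT_MAX = 105  # 30 + 35 + 15 + 50/2

def afqt_raw(scores):
    """AFQT raw composite: AR + WK + PC + one-half of NO."""
    return scores["AR"] + scores["WK"] + scores["PC"] + scores["NO"] / 2

def is_retest_case(record):
    """Apply the exclusion rules to one applicant record (hypothetical layout)."""
    if record["is_verification"]:
        return False  # verification form, not a true retest
    first, second = record["first"], record["second"]
    if all(first[s] == second[s] for s in SUBTESTS):
        return False  # identical scores on all ten subtests
    if afqt_raw(first) > AFQT_MAX or afqt_raw(second) > AFQT_MAX:
        return False  # composite above the possible maximum
    return True

base = dict.fromkeys(SUBTESTS, 10)
applicant = {"is_verification": False,
             "first": base,
             "second": dict(base, CS=20)}
print(is_retest_case(applicant))
```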

METHOD 

A series of analyses was conducted on the retest Army applicant pool to
examine the differences in the test scores of those applicants whose records
indicated that they had taken the ASVAB more than once. The two most recent
scores were used for comparison purposes.

The  initial  and  retest  scores  were  factor  analyzed.  Mean  and  standard 
deviations  of  the  initial  and  retest  scores  were  compared  for  each  subtest. 
Comparisons  of  the  AFQT,  CO  and  EL  composite  scores  for  retested  applicants 
were  made. 

RESULTS 

A  tabulation  of  the  number  of  individuals  who  took  the  ASVAB  on  two 
different  occasions  is  shown  in  Table  8.  As  can  be  seen  from  this  table, 
1,774  (6%)  of  the  individuals  appeared  to  have  taken  the  same  version  as  both 
initial  and  retests.  Given  this  number,  it  was  determined  that  further 
analyses  of  the  retest  data  would  be  conducted  separately  for  this  group  of 
individuals. 

A comparison of the mean subtest scores for all retested applicants
yielded the same results for both groups; that is, those who took the same
version of the ASVAB as a retest and those who took different versions of the
ASVAB as a retest (see Tables 9 and 10). A graph of the frequency
distributions of the retest scores for the Word Knowledge subtest shows
similar distributions for both groups (see Figures 1 and 2). Further support
for these results is shown in the graphs of initial and retest mean
scores and in the plots of the reciprocal of the standard deviation of each
mean retest score for the Word Knowledge subtest, presented in Figures 3
and 4, respectively.(1)

(1) Graphic distributions for all subtests may be found in the Appendices.


TABLE 10. Retest Applicants Means and Standard Deviations.

Different Versions of ASVAB (N = 26,137)*

Subtest                 Initial               Retest
Name                   X        sd           X        sd          z

GS                  10.804    3.674       11.104    3.756      9.231
AR                  11.371    3.788       11.902    4.162     15.254
WK                  17.512    5.405       17.982    5.675      9.696
PC                   7.365    2.680        7.771    2.812     16.897
NO                  29.407    8.784       32.321    9.286     36.856
CS                  36.446   13.122       42.104   13.382     48.806
AS                  11.391    4.632       11.882    4.733     11.986
MK                   7.991    2.919        8.313    2.999     12.439
MC                  10.373    3.872       11.076    4.096     20.164
EI                   8.736    3.153        9.102    3.172     13.230
AFQT Composite      51.200   10.524       54.063   11.967     29.044
CO Composite       168.779   18.694      175.142   19.816     37.761
EL Composite       164.499   17.152      167.237   18.373     17.611

Same Version of ASVAB (N = 1,774)

Subtest                 Initial               Retest
Name                   X        sd           X        sd          z

GS                  10.512    4.102       11.309    4.194      5.722
AR                  11.091    4.346       12.043    4.916      6.111
WK                  16.776    6.397       18.015    6.584      5.685
PC                   6.994    2.951        7.759    3.165      7.446
NO                  28.267    9.686       31.532    9.812      9.974
CS                  35.032   14.288       40.567   14.338     11.517
AS                  10.997    4.710       11.890    4.804      5.591
MK                   7.962    3.371        8.499    3.540      4.627
MC                  10.083    4.048       11.090    4.237      7.238
EI                   8.433    3.360        9.238    3.509      6.979
AFQT Composite      49.246   13.920       53.836   14.787      9.520
CO Composite       166.271   21.713      174.421   22.485     10.933
EL Composite       162.742   21.038      168.427   22.514      7.771

*3,008 took same non-AFQT portion of the ASVAB
As can be seen from Table 9, there is a significant increase in the mean
subtest scores between the initial and retest administrations. This increase
might be attributed to the practice effects of test taking. Indeed, the
largest increases in mean scores were found to be in the two speeded
subtests.

A comparison of the retest applicants' mean scores was made with the
initial-test-only applicants' mean scores for each subtest to examine the
differences which reflect actual numbers of items that were answered
correctly. The mean scores for the last previous and most recent tests for
the retest applicants and the initial-test-only applicants were converted to
percent correct for each subtest (see Table 11). The difference in percent
correct between the initial-test-only applicants' and the retest applicants'
scores was computed for each subtest. The difference was multiplied by the
number of items in each subtest to generate an index of the approximate
difference in actual number of items answered correctly by the
initial-test-only applicants as compared to the retested applicants.
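The index described above can be sketched as follows, using Arithmetic Reasoning as the example. The means are taken from Table 1 (contractor-scored, all matched applicants) and Table 10 (different-versions retest group); the function names are illustrative, and the choice of which retest-group mean to compare is an assumption, since the paragraph does not say.

```python
def percent_correct(mean_score, n_items):
    """Mean raw score expressed as a percentage of the items in the subtest."""
    return 100.0 * mean_score / n_items

def items_different(pct_reference, pct_group, n_items):
    """Difference in percent correct converted back to a number of items."""
    return (pct_reference - pct_group) * n_items / 100.0

AR_ITEMS = 30
ar_initial_only = percent_correct(15.954, AR_ITEMS)  # Table 1, contractor mean
ar_retest = percent_correct(11.902, AR_ITEMS)        # Table 10, retest mean
ar_gap = items_different(ar_initial_only, ar_retest, AR_ITEMS)
print(f"AR: about {ar_gap:.1f} items")
```

Because percent correct is the mean divided by the item count, the index reduces to the difference between the two mean raw scores.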

The two power tests which showed the greatest difference in the index
were AR and WK. There were no differences in this index for the two speeded
subtests; however, when the same computation was made for these two tests
using percent values, differences of four and six items answered correctly
were found for NO and CS, respectively. This result tends to
support the notion that practice effects were influencing the scores of these
two subtests.

Whereas the previous comparisons were made to examine the effects of
scoring differences on shifts between AFQT categories, the data in Table 12
present shifts made in AFQT categories by virtue of retests taken by this
group of applicants. As with previous results, where the mean scores
were higher for the most recent testing than for the last previous testing,
the table shows the number of persons who changed from a lower AFQT category
to a higher one. Although the number who shift appears to be large,
particularly in the lower scoring categories, this shift is expected due to
the construction of the AFQT. Table 13 shows similar results for the Army CO
Composite. The shift into a higher category reflects the nature of
the structure of the subtests which comprise this composite. Shifts to
higher categories are easier to achieve for lower scoring applicants because
the category intervals are relatively short. Thus, an increase in one's
speed on the CS subtest alone could be sufficient to move the applicant
at least one category.

Table 14 shows the shifts which occurred between categories for the EL
composite, a composite which contains no speeded subtests. As can be seen
from this table, far fewer category shifts occur between the initial test and
the retest than for the composites that include speeded subtests.

The scoring shifts in percents for the AFQT, CO and EL composites are
presented in Tables 15, 16 and 17. As can be seen from these tables, AFQT is
the most stable of the composites.
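The percent tables appear to be column percentages of the corresponding frequency tables, that is, each cell divided by the total for its initial-score category. The sketch below illustrates this with the counts printed in Table 12 for applicants whose initial AFQT category was IVC; the function name is illustrative and the rounding rule is an assumption.

```python
def column_percents(counts):
    """Express each count as a whole-number percent of the column total."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]

# Table 12, initial category IVC: retest categories I, II, IIIA, IIIB,
# IVA, IVB, IVC and V, in that order.
ivc_initial = [0, 13, 23, 191, 1248, 2632, 3964, 983]
shifts = column_percents(ivc_initial)
print(shifts)
```

Under this reading, 44 percent of applicants who began in Category IVC remained there on retest, and 45 percent moved up one or more categories.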
TABLE 11. Comparison of Average Percent Correct Responses and Standard
Scores of ASVAB Subtest Items for Initial and Retest Testing
Army Applicants and All Army Applicants With Scored Matched
Tests From Contractor.


                              Retested Applicants               All Matched        Approximate  Approximate
                                 (N = 27,911)                 Army Applicants       Number       Number
           Number of      Initial            Retest             (N = 143,279)       of Items     of Items
Subtest    Items in    Percent    Std.    Percent    Std.     Percent    Std.      Different    Different
Name       Subtest     Correct    Score   Correct    Score    Correct    Score
                         (1)                (2)                 (3)

41 

37.9 

42 

39.7 

39 

50.0 

40 

51.4 

40 

49.1 

43 

51.8 

44 

58.8

47 

64.6 

46 

43.4 

49 

50.1 

41 

45.6 

43 

47.5 

43 

32.0 

44 

33.3 

40 

41.5 

42 

44.3 

42  43.7 


Total  Difference 


TABLE 12. Frequency of Initial and Retest Scores FY 1981 Army Applicants by
AFQT Mental Category.*

Retest Scores                     Initial Test Scores AFQT Mental Category
AFQT Mental          I       II     IIIA     IIIB      IVA      IVB      IVC        V
Category       101-105   84-100    76-83    65-75    56-64    49-55    38-48     0-37    Total

I                    8       12        0        3        1        0        0        4       28
II                   4      127       55       19       24        7       13       17      266
IIIA                 0       35       63      157      277       24       23       18      597
IIIB                 0       14       54      574    3,097      579      191       49    4,558
IVA                  1        6        8      179    3,497    2,210    1,248       77    7,226
IVB                  0        1        3       31    1,275    2,247    2,632      169    6,358
IVC                  0        2        3       10      337    1,297    3,964      880    6,493
V                    0        2        0        4       54      123      983    1,219    2,385

Total               13      199      186      977    8,562    6,487    9,054    2,433   27,911


TABLE 13. Frequency of Initial and Retest Scores FY 1981 Army Applicants
for CO Composite.

Retest                                    Initial Score
Score            120+  110-119  105-109  100-104    95-99    90-94    85-89    80-84    40-79    Total

120+               44       46        5        9        3        4        0        0       15      126
110-119            12       77       68      145       71       40       26       10       28      477
105-109             1       34       58      167      156       92       43       12       24      587
100-104             1       19       61      308      411      367      224      101      112    1,604
95-99               1       10       28      161      368      474      473      278      297    2,090
90-94               0        3        5       67      271      516      681      617      786    2,946
85-89               0        2        2       31      103      383      707      948    1,851    4,027
80-84               1        2        1        7       31      171      465      815    2,607    4,100
40-79               6        2        5       10       35      118      385      995   10,397   11,953

TABLE 14. Frequency of Initial and Retest Scores FY 1981 Army Applicants
for EL Composite.*

(Entries for initial scores of 90-94 and above, and for retest scores of
100-104 and above, are not legible.)

Retest                 Initial Score
Score           85-89    80-84    40-79    Total

95-99             270      195      159    1,450
90-94             657      597      552    2,758
85-89             649      792    1,302    3,284
80-84             500      853    2,151    3,855
40-79             599    1,583   12,609   15,042

Total           2,771    4,089   16,886   27,911
% of Total        9.9     14.7     60.5    100.0

TABLE 15. Scoring Shifts in Percents for AFQT Composite From Initial to
Retest: FY 1981 Army Applicants.*

Retest
AFQT Mental            Initial Test AFQT Mental Category
Category          V    IVC    IVB    IVA   IIIB   IIIA     II      I

I                 0      0      0      0      0      0      6     62
II                1      0      0      0      2     30     64     31
IIIA              1      0      0      3     16     34     18      0
IIIB              2      2      9     36     59     29      7      0
IVA               3     14     34     41     18      4      3      8
IVB               7     29     35     15      3      2      1      0
IVC              36     44     20      4      1      2      1      0
V                50     11      2      1      0      0      1      0

% of Total**      9     32     23     31      4      1      1      0

TABLE  16.  Scoring  Shifts  in  Percents  for  CO  Composite  from  Initial  to 
Retest:  FY  1981  Army  Applicants.* 


TABLE 17. Scoring Shifts in Percents for EL Composite from Initial to
Retest: FY 1981 Army Applicants.*

Retest                           Initial Test Score
Score            120+  110-119  105-109  100-104    95-99    90-94    85-89

120+               66       14        3        1        0        0        0
110-119            18       55       23        7        2        1        0
105-109             0       15       28       14        7        2        1
100-104             4        4       23       33       22       11        3
95-99               1        6       13       22       25       19       10
90-94               1        2        5       17       25       27       24
85-89               1        2        2        3       10       19       23
80-84               3        2        1        1        5       13       18
40-79               4        1        2        1        4        9       22

% of Total**        0        0        1        2        4        8       10

*EL = GS + AR + MK + EI

**May not total 100 due to rounding


DISCUSSION 


The impact of this practice effect must be considered in light of the
potential for marginally-qualified individuals to become eligible for
enlistment in the Armed Services. The capability of an individual to change
his or her qualifying score by increasing the speed at which the NO questions
are answered has both positive and negative implications. On the one hand,
it might be a positive indication that the individual has a willingness and
capability to learn repetitive tasks. On the other hand, it might be an
indication that the individual has only limited skills, such as the ability
to acquire speed in performing simple computations. Indeed, a close
examination of the retest means and standard deviations shows not only an
increase in the average subtest scores, but a comparable increase in the
spread of the scores. Thus, the gain achieved in the speeded test scores may
be offset by the loss of points in the power tests.
GENERAL DISCUSSION AND RECOMMENDATIONS


The  two  major  purposes  of  this  project  were  to  examine  the  reliability 
of  the  ASVAB  data  reported  by  the  MEPS;  and  to  examine  the  relationships 
between  the  scores  reported  for  Army  applicants  who  took  the  ASVAB  on  more 
than  one  occasion.  Comparisons  were  made  of  ASVAB  subtest  scores  reported  by 
Army  applicants  from  a  pool  of  individuals  who  applied  for  military  service 
during  fiscal  year  1981.  In  addition,  test-retest  scores  were  compared  for 
approximately  30,000  applicants. 

The MEPS-reported scores were found to be highly reliable with the
exception of those for the CS subtest. This result is consistent with other
similar research conducted by the ARI. It is recommended that both factors
noted earlier, mistiming and misscoring of the items, be more closely
examined to determine if remedial steps are required to achieve more
accurate scores on this subtest.

The  analyses  of  the  test-retest  data  for  applicants  revealed  the 
potential  impact  of  practice  effects  on  increasing  scores  for  speeded 
subtests  of  the  ASVAB.  A  recommended  solution  to  this  problem  is  the 
introduction  of  a  practice  test  prior  to  the  administration  of  the  initial 
battery  so  as  to  minimize  the  potential  for  large  differences  in  the 
test-retest  scores  which  may  be  attributable  to  such  an  effect. 

Finally,  an  assessment  of  the  scoring  error  rate  using  the  AFQT  and  Army 
Combat  Composite  resulted  in  small  percentages  for  both  composites. 
Nevertheless,  it  is  recommended  that  these  data  be  examined  more  closely  to 
determine  the  effects  of  errors  in  scoring  the  various  subtests,  using  more 
robust  techniques,  such  as  polynomial  models  to  determine  the  effects  of  the 
error  by  individual  subtests  and  processing  stations. 
FREQUENCIES FOR ASVAB SUBTESTS:
RETESTED APPLICANT POOL

MEAN RETEST SCORES FOR ASVAB S
RETESTED APPLICANT POO